Sampling 3D Molecular Conformers with Diffusion Transformers

Neural Information Processing Systems

Diffusion Transformers (DiTs) have demonstrated strong performance in generative modeling, particularly in image synthesis, making them a compelling choice for molecular conformer generation. However, applying DiTs to molecules introduces novel challenges, such as integrating discrete molecular graph information with continuous 3D geometry, handling Euclidean symmetries, and designing conditioning mechanisms that generalize across molecules of varying sizes and structures. We propose DiTMC, a framework that adapts DiTs to address these challenges through a modular architecture that separates the processing of 3D coordinates from conditioning on atomic connectivity. To this end, we introduce two complementary graph-based conditioning strategies that integrate seamlessly with the DiT architecture. These are combined with different attention mechanisms, including both standard non-equivariant and SO(3)-equivariant formulations, enabling flexible control over the trade-off between accuracy and computational efficiency. Experiments on standard conformer generation benchmarks (GEOM-QM9, -DRUGS, -XL) demonstrate that DiTMC achieves state-of-the-art precision and physical validity. Our results highlight how architectural choices and symmetry priors affect sample quality and efficiency, suggesting promising directions for large-scale generative modeling of molecular structures.
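The Euclidean symmetry mentioned in the abstract can be illustrated with a toy check: rotating a conformer changes its raw coordinates but leaves all interatomic distances unchanged. The sketch below is not taken from DiTMC; the helper names `rotate_z` and `pairwise_distances` and the three-atom coordinates are hypothetical and chosen only for illustration.

```python
import math

def rotate_z(coords, angle):
    """Rotate a list of (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def pairwise_distances(coords):
    """Return the distances between all unordered pairs of points."""
    return [
        math.dist(a, b)
        for i, a in enumerate(coords)
        for b in coords[i + 1:]
    ]

# Hypothetical three-atom "conformer" (arbitrary units).
conformer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
rotated = rotate_z(conformer, 0.7)

# Coordinates differ after rotation, but distances agree up to rounding.
before = pairwise_distances(conformer)
after = pairwise_distances(rotated)
max_gap = max(abs(a - b) for a, b in zip(before, after))
```

A non-equivariant model sees `conformer` and `rotated` as different inputs and has to learn that they describe the same geometry, whereas an SO(3)-equivariant formulation builds this in by construction. This is one way to read the accuracy/efficiency trade-off the abstract refers to.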




A Proof of proposition

Neural Information Processing Systems

Let's assume we apply a random CCW torsion rotation of angle We detail here the formulae used in section 2.4. Similar to AlphaFold [Senior et al., 2020], we fit distances using normal distributions and angles Such cases require a special treatment. So far, we haven't tackled the following difficulty: Examples are hydrogen groups as in Figure 1. We propose a new loss function based on eq. The EMD computation cannot be parallelized in mini-batches in the current version of the library, but everything else is batch-parallelizable in our model (e.g., The training stage happens without assembling the full conformer.



Appendix A: Visualizations (GeoMol, GeoDiff, RDKit, RMCF)

Neural Information Processing Systems

Figure 6: Examples of generated molecules from the GEOM-Drugs dataset. For every model and molecule, we show three ground truths and the best-aligned conformations.


InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

arXiv.org Artificial Intelligence

Without using an explicit reward, direct preference optimization (DPO) employs paired human preference data to fine-tune generative models, a method that has garnered considerable attention in large language models (LLMs). However, exploration of aligning text-to-image (T2I) diffusion models with human preferences remains limited. In comparison to supervised fine-tuning, existing methods that align diffusion models suffer from low training efficiency and subpar generation quality due to the long Markov chain process and the intractability of the reverse process. To address these limitations, we introduce DDIM-InPO, an efficient method for direct preference alignment of diffusion models. Our approach conceptualizes the diffusion model as a single-step generative model, allowing us to selectively fine-tune the outputs of specific latent variables. To accomplish this, we first assign implicit rewards to any latent variable directly via a reparameterization technique. We then construct an inversion technique to estimate appropriate latent variables for preference optimization. This modification enables the diffusion model to fine-tune only the outputs of latent variables that have a strong correlation with the preference dataset. Experimental results indicate that DDIM-InPO achieves state-of-the-art performance with just 400 steps of fine-tuning, surpassing all preference-alignment baselines for T2I diffusion models in human preference evaluation tasks.
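For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO objective for a single preference pair, written for scalar log-likelihoods. It is not the DDIM-InPO loss, which the abstract does not spell out; the function names, the argument names, and the default `beta=0.1` are assumptions made for illustration.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_*     : log-likelihoods under the model being fine-tuned
    ref_logp_* : log-likelihoods under the frozen reference model
    beta       : strength of the implicit KL regularization (assumed value)
    """
    # Implicit reward of each sample is beta * (log p_model - log p_ref).
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin)

# When the model equals the reference, the margin is zero and the loss is log(2).
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the preferred sample's likelihood relative to the rejected one lowers the loss.
improved = dpo_loss(-0.5, -2.0, -1.0, -1.0)
```

No explicit reward model appears anywhere in the computation: the reward is implied by the log-likelihood ratio against the reference model, which is the property the abstract's first sentence describes.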


NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation

arXiv.org Artificial Intelligence

3D molecule generation is crucial for drug discovery and material design. While prior efforts focus on 3D diffusion models for their benefits in modeling continuous 3D conformers, they overlook the advantages of 1D SELFIES-based Language Models (LMs), which can generate 100% valid molecules and leverage billion-scale 1D molecule datasets. To combine these advantages for 3D molecule generation, we propose a foundation model -- NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation. NExT-Mol uses an extensively pretrained molecule LM for 1D molecule generation, and subsequently predicts the generated molecule's 3D conformers with a 3D diffusion model. We enhance NExT-Mol's performance by scaling up the LM's model size, refining the diffusion neural architecture, and applying 1D to 3D transfer learning. Notably, our 1D molecule LM significantly outperforms baselines in distributional similarity while ensuring validity, and our 3D diffusion model achieves leading performance in conformer prediction. Given these improvements in 1D and 3D modeling, NExT-Mol achieves a 26% relative improvement in 3D FCD for de novo 3D generation on GEOM-DRUGS, and a 13% average relative gain for conditional 3D generation on QM9-2014. Our code and pretrained checkpoints are available at https://github.com/acharkq/NExT-Mol.


Two-stage hybrid models for enhancing forecasting accuracy on heterogeneous time series

arXiv.org Artificial Intelligence

Compared to local models built in a series-by-series manner, global models leverage relevant information across time series, resulting in improved forecasting performance and generalization capacity. Constructing global models on a set of time series is becoming mainstream in the field of time series forecasting. However, the advantages of global models may not always be realized when dealing with heterogeneous data. While they can adapt to heterogeneous datasets by increasing model complexity, a model cannot be made arbitrarily complex given a finite sample size, which poses challenges for the application of global models. Additionally, determining whether time series data is homogeneous or heterogeneous can be ambiguous in practice. To address these research gaps, this paper argues that the heterogeneity of the data should be defined by the global model used: for each series, the portion not modelled by the global model represents heterogeneity. It further proposes two-stage hybrid models, which include a second stage to identify and model heterogeneous patterns. In this second stage, we can estimate either all local models or sub-global models across different domains divided based on heterogeneity. Experiments on four open datasets reveal that the proposed methods significantly outperform five existing models, indicating that they help fully unleash the potential of global models on heterogeneous datasets.
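The two-stage idea can be sketched with a deliberately small toy, assuming a pooled AR(1) coefficient as the global model and a per-series mean residual as the local model. These model choices and all function names are hypothetical; the abstract does not state which global or local models the paper uses.

```python
def fit_global_ar1(series_list):
    """Stage 1: one AR(1) coefficient fitted by least squares on all series pooled."""
    num = den = 0.0
    for s in series_list:
        for prev, cur in zip(s, s[1:]):
            num += prev * cur
            den += prev * prev
    return num / den if den else 0.0

def fit_local_offsets(series_list, phi):
    """Stage 2: per-series mean of what the global model leaves unexplained."""
    offsets = []
    for s in series_list:
        residuals = [cur - phi * prev for prev, cur in zip(s, s[1:])]
        offsets.append(sum(residuals) / len(residuals) if residuals else 0.0)
    return offsets

def forecast_next(series_list, phi, offsets):
    """One-step forecast: global prediction plus the series-specific correction."""
    return [phi * s[-1] + b for s, b in zip(series_list, offsets)]

# Two toy series with opposite trends, i.e. heterogeneous under a shared AR(1).
data = [[0.0, 1.0, 2.0, 3.0], [0.0, -1.0, -2.0, -3.0]]
phi = fit_global_ar1(data)
offsets = fit_local_offsets(data, phi)
predictions = forecast_next(data, phi, offsets)
```

Here the residuals of the pooled model play the role of the "portion not modelled by the global model": the two series receive local offsets of opposite sign, which a single global coefficient cannot express on its own.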